GENERALIZED PANORAMIC MOSAIC



FIELD OF THE INVENTION 

This invention relates to video image mosaicing for obtaining 
panoramic mosaics of a scene. 

PRIOR ART 

Prior art references considered to be relevant as a background to 
the invention are listed below. Acknowledgement of the references herein shall 
not be inferred as meaning that these are in any way relevant to the 
patentability of the invention disclosed herein. Each reference is identified by
a number enclosed in square brackets, and accordingly the prior art will be
referred to throughout the specification by numbers enclosed in square brackets.

[1] ARPA Image Understanding Workshop, Monterey, California,
November 1994, Morgan Kaufmann.

[2] Fifth International Conference on Computer Vision, Cambridge, 
MA, June 1995, IEEE-CS. 

[3] IEEE Conference on Computer Vision and Pattern Recognition, San 
Francisco, California, June 1996. 

[4] P.J. Burt and E.H. Adelson. A multiresolution spline with
application to image mosaics. ACM Trans. on Graphics, 2(4), pages
217-236, October 1983.

[5] P.J. Burt and P. Anandan. Image stabilization by registration to a
reference mosaic. In ARPA Image Understanding Workshop [1],
pages 425-434.

[6] S.E. Chen and L. Williams. View interpolation for image synthesis. 
In SIGGRAPH, pages 279-288, Anaheim, California, August 1993, 
ACM. 

[7] T.R. Halfhill. See you around. Byte Magazine, pages 85-90, May
1995.

[8] M. Hansen, P. Anandan, K. Dana, G. van der Wal, and P.J. Burt. 
Real-time scene stabilization and mosaic construction. In ARPA 
Image Understanding Workshop [1], pages 457-465. 

[9] M. Irani, P. Anandan, and S. Hsu. Mosaic based representations of
video sequences and their applications. In Fifth International
Conference on Computer Vision [2], pages 605-611.

[10] P. Jaillon and A. Montanvert. Image mosaicing applied to
three-dimensional surfaces. In 12th International Conference on
Pattern Recognition, pages 253-257, Jerusalem, Israel, October
1994, IEEE-CS.

[11] A. Krishnan and N. Ahuja. Panoramic image acquisition. In IEEE 
Conference on Computer Vision and Pattern Recognition [3], pages 
379-384. 

[12] S. Mann and R. Picard. Virtual bellows: Constructing high quality 
stills from video. In First IEEE International Conference on Image 
Processing, Austin, Texas, November 1994. 

[13] L. McMillan and G. Bishop. Plenoptic modeling: An image-based 
rendering system. In SIGGRAPH, Los Angeles, California, August 
1995, ACM. 

[14] D.L. Milgram. Computer methods for creating photomosaics. IEEE
Trans. on Computers, C-24, pages 1113-1119, 1975.

[15] D.L. Milgram. Adaptive techniques for photomosaicing. IEEE
Trans. on Computers, C-26, pages 1175-1180, 1977.

[16] S. Peleg. Elimination of seams from photomosaics. Computer
Graphics and Image Processing, 16, pages 90-94, May 1981.

[17] B. Rousso, S. Avidan, A. Shashua, and S. Peleg. Robust recovery of
camera rotation from three frames. In IEEE Conference on
Computer Vision and Pattern Recognition [3], pages 796-802.

[18] H.S. Sawhney, S. Ayer, and M. Gorkani. Model-based 2D and 3D
dominant motion estimation for mosaicing and video representation.
In Fifth International Conference on Computer Vision [2], pages
583-590.

[19] S. Seitz and C. Dyer. Physically valid view synthesis by image
interpolation. In Proc. IEEE Workshop on Representation of Visual
Scenes, Cambridge, MA, June 1995, IEEE-CS.

[20] R. Hartley and R. Gupta. Linear pushbroom cameras. In J.O.
Eklundh, editor, Third European Conference on Computer Vision,
pages 555-566, Stockholm, Sweden, May 1994, Springer.

[21] M. Irani, B. Rousso, and S. Peleg. Detecting and tracking multiple
moving objects using temporal integration. In G. Sandini, editor,
Second European Conference on Computer Vision, pages 282-287,
Santa Margherita, Italy, May 1992, Springer.

[22] S. Peleg, J. Herman, D. Dixon, P.J. Burt, and J.R. Bergen. U.S.
Patent Application - Improved methods for mosaic image
construction.

[23] R. Szeliski. Video mosaics for virtual environments. IEEE
Computer Graphics and Applications, pages 22-30, March 1996.

[24] R. Szeliski and S.B. Kang. Direct methods for visual scene
reconstruction. In Proc. IEEE Workshop on Representation of Visual
Scenes, Cambridge, MA, June 1995, IEEE-CS, pages 26-33.

[25] J.Y. Zheng and S. Tsuji. Panoramic representation for route
recognition by a mobile robot. International Journal of Computer Vision,
Vol. 9, pages 55-76, 1992.

BACKGROUND OF THE INVENTION

The need to combine pictures into panoramic mosaics has existed since
the beginning of photography, since the camera's field of view is always smaller
than the human field of view. Also, very often large objects cannot be captured in
a single picture, and only photo-mosaicing enables a more complete view. Digital
photography created new applications for mosaicing [14, 15, 16, 4, 24, 23], which
were first implemented for aerial and satellite images.

Three major issues are important in traditional image mosaicing: 

(i) Image alignment, which determines the transformation
that aligns the images to be combined into a mosaic. Paper
photo-mosaicing uses rigid transformations for alignment: picture translations
(shifts) and rotations. Digital processing enables more general transformations,
such as affine or planar-projective transformations.

(ii) Image cut and paste, which is necessary since most regions
in the panoramic mosaic are overlapping, and are covered by more than one
picture. The cut and paste process involves either a selection of a single image
for each overlapping region, or some kind of a combination of all overlapping
images.

(iii) Image blending, which is necessary to overcome the
intensity differences between images, differences that are present even when
images are perfectly aligned. Such differences are created by a dynamically
changing camera gain.

The simplest mosaics are created from a set of images whose
mutual displacements are pure image-plane translations. This is approximately
the case with some satellite images. Such translations can be computed either by
manually pointing to corresponding points, or by image correlation methods.
Other simple mosaics are created by rotating the camera around its optical center
using a special device, and creating a panoramic image which represents the
projection of the scene onto a cylinder [7, 11, 12, 13] or a sphere. Since it is not
simple to ensure a pure rotation around the optical center, such mosaics can be
used only in limited cases.

In more general camera motions, which may include both camera
translations and camera rotations, more general transformations for image
alignment are used [5, 8, 9, 10, 18]. In most cases images are aligned pairwise,
using a parametric transformation such as an affine transformation or a
planar-projective transformation. These transformations include an intrinsic
assumption regarding the structure of the scene, such as being planar. A reference
frame is selected, and all images are aligned with this reference frame and
combined to create the panoramic mosaic. These methods are therefore referred
to as reference frame based methods.

Aligning all frames to a single reference frame is reasonable when
the camera is far away and its motion is mainly a sideways translation and a
rotation around the optical axis. Significant distortions are created when camera
motions include other rotations. Fig. 1 shows the effects of large rotations on
reference frame based methods. The objects a, b, x, y, c, d, w, z are viewed from
two cameras C1 and C2. The image I1 is selected to be a reference frame and
image I2 is projected onto that reference frame. Large rotations generate
distortions when projecting on the reference frame, and the information derived
from frames with such rotations is blurred, and almost useless. Moreover, in long
sequences in which the camera is traveling in a complex path, one frame cannot
be used for long as a reference frame, and projection of the entire sequence onto
that frame becomes impractical.

Recently, a method called "Manifold Projection" [22] has been
proposed to create mosaics in more general cases. This method performs
alignments using only image-plane translations and rotations, constructs the
mosaic from the center-most parts of the images, and merges the images into a
seamless panorama. The manifold projection method is very similar to the one in
[25], where a mosaic is constructed by scanning a scene with a one-dimensional,
straight array. However, while in [25] camera motion is measured by an external
device, in [22] the camera motion is measured from the images in the sequence.

However, none of the above methods can handle cases where 
images cannot be aligned due to parallax, or cases of zoom and forward motion. 

Manifold Projection simulates the sweeping of a scene using a
linear one-dimensional sensor array, see Fig. 2. Such a one-dimensional sensor
can scan the scene by arbitrary combinations of rotations and translations, and in
all cases the scanning will result in a sensible panoramic image, provided that
the incoming one-dimensional image strips can be aligned. Some
satellite images are created by scanning the earth with a one-dimensional sensor
array using a rotating mirror. Since in this case the alignment of the sensors can
be done using the location of the satellite and the position of the mirror,
panoramic two-dimensional images are easily obtained. Fig. 2 shows aerial
photography with a linear one-dimensional scan system.

In more general cases the motion of the sweeping plane may not be
known. It seems impossible to align the one-dimensional image strips coming
from an arbitrary plane sweep, but the problem becomes easier when the input is
a video sequence. A two-dimensional frame in a video sequence can be regarded
as having a one-dimensional strip somewhere in the center of the image
("center-strip"), embedded in the two-dimensional image to facilitate alignment. The
motion of the sweeping plane can then be computed from the entire image, and
applied to the center-strip for alignment and mosaicing.
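The center-strip idea can be illustrated with a minimal Python sketch. It is not the method of [22]; it assumes the simplest case only: pure horizontal image translation, with the per-frame displacement already known as a positive integer number of pixels. The function name `center_strip_mosaic` and its arguments are hypothetical, and images are plain lists of pixel rows.

```python
def center_strip_mosaic(frames, shifts):
    """Paste one vertical strip per frame, side by side.

    frames: list of images of equal size, each a list of pixel rows.
    shifts: shifts[k] is the assumed (known, positive, integer) horizontal
            displacement in pixels between frame k and frame k + 1; it is
            also used as the width of the strip cut from frame k.
    """
    height = len(frames[0])
    width = len(frames[0][0])
    start = width // 2                      # every strip begins at the center column
    mosaic = [[] for _ in range(height)]
    for frame, shift in zip(frames, shifts):
        if start + shift > width:
            raise ValueError("displacement too large for a strip at the center")
        for y in range(height):
            # The strip is as wide as the displacement, so consecutive
            # strips abut without gaps or overlaps.
            mosaic[y].extend(frame[y][start:start + shift])
    return mosaic
```

Because each strip starts at the same column and is exactly as wide as the displacement to the next frame, the strip cut from frame k + 1 continues where the strip from frame k ends.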

The image transformations of the one-dimensional strips generated
by the sweeping plane are only rigid transformations: image plane translations
and rotations. Therefore, rigid transformations are also the transformations used
in manifold projection. It should be noted that general camera motions induce, in
general, non-rigid image-plane transformations. However, to simulate the plane
sweep only rigid transformations are used for the center-strip.

The panoramic mosaic generated by combining the aligned
one-dimensional center-strips forms the manifold projection. This is a projection
of the scene onto a general manifold, which is a smooth manifold passing through
the centers of all image planes constructing the mosaic. In the case of pure
camera translations (Fig. 3a), the manifold projection turns out to be a parallel
projection onto a plane. In the case of pure camera rotations (Fig. 3b), it is a
projection onto a cylinder, whose principal axis is the rotation axis. But when
both camera translations and rotations are involved, as in Fig. 3c, the manifold is
not a simple manifold any more. In Figs. 3a, 3b and 3c the camera is located at
the tip of the "field-of-view" cone, and the image plane is marked by a solid
segment. The ability to handle such arbitrary combinations of camera rotations
and translations is the major distinction between manifold projection and all
previous mosaicing approaches.

In view of the foregoing, it should be apparent that there exists a
need to provide a method for the creation of panoramic image mosaics in cases
not treated in the prior art. Such cases involve camera translations with image
parallax; forward motion; camera motions that are combinations of translations
and rotations; and camera zoom.

SUMMARY OF THE INVENTION

It is important to note that whenever the terms "video", "movie",
"frame", "picture", or "image" are used, they refer to any representation of a
picture or a movie (motion picture). A still picture can be recorded on film by a
traditional camera, by a digital camera, by a scanner, or any other device that
records still images. A video (or a motion picture) can be recorded by a film
camera, an analog or a digital videotape, or any other device that records
motion pictures. The area of image mosaicing in general, and this invention in
particular, is applicable to all forms of images which can be manipulated by
appropriate devices, whether mechanical, optical, digital, or any other
technology.

Panoramic mosaics are constructed by combining strips from the
image sequence. In accordance with the present invention, the shape, size and
position of the strips are determined for each image in accordance with the type
of camera motion. The strips are cut from the images, and pasted into the
panoramic mosaic after being transformed, such that the resulting mosaic remains
continuous.

In accordance with the present invention, the following constraints
are preferably (but not necessarily) used in order to deal with general image plane
transformations:

(a) the strips should be approximately perpendicular to the optical flow;

(b) the strips collected for pasting should be warped before pasting into
the panoramic image so that, after warping, their original optical
flow becomes approximately parallel to the direction in which the
panoramic image is constructed.

Under these conditions, cases of zoom and forward motion can be
handled as well as the other simple cases. For example, in the case of zoom or
forward motion, these properties enable cutting circular strips, and proper
bending of them before pasting into the panoramic image.
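Constraint (a) can be made concrete with a small Python sketch. It assumes an idealized flow model that is not taken from the specification: under zoom or forward motion the flow at a pixel is taken to point radially away from a focus of expansion `(cx, cy)`, scaled by a hypothetical `rate`. The helper names `radial_flow` and `strip_direction` are illustrative only.

```python
import math


def radial_flow(x, y, cx, cy, rate):
    """Idealized optical flow for zoom or forward motion: radial from (cx, cy)."""
    return (rate * (x - cx), rate * (y - cy))


def strip_direction(u, v):
    """Unit vector perpendicular to the flow (u, v).

    A strip boundary that follows this direction at every pixel is
    perpendicular to the optical flow, as constraint (a) asks.
    """
    norm = math.hypot(u, v)
    if norm == 0.0:
        raise ValueError("flow vanishes here; the strip direction is undefined")
    return (-v / norm, u / norm)
```

For radial flow the perpendicular direction is tangent to a circle around the focus of expansion, which is consistent with the circular strips mentioned above; for purely horizontal flow it is vertical, giving straight vertical strips.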

This invention also describes how to determine the width of the
strips. For example, in order to handle image parallax properly, the size of the
strips can be determined from the camera's three-dimensional motion, as can be
computed from the sequence itself, or as can be measured by external devices.

To enable smooth mosaics even when frames to be combined are
taken from different viewpoints, and have substantial parallax, views can be
synthesized for in-between camera positions. For smoothest mosaics the number
of in-between camera positions is selected such that the strip is narrow, e.g.
having a width of a single pixel.

The present invention provides for a method for combining a
sequence of two-dimensional images of a scene to obtain a panoramic mosaic of
said scene, said sequence of two-dimensional images being acquired by a moving
camera in relative motion with respect to said scene, said camera having an
optical center, the camera motion giving rise to optical flow between the images,
the method comprising the steps of:
warping the images; and
pasting the images into the panoramic image,

such that the optical flow becomes substantially parallel to the direction in
which the mosaic is constructed.

The invention still further provides for combining a sequence of
two-dimensional images of a scene to obtain a panoramic mosaic of said scene,
said sequence of two-dimensional images being acquired by a moving camera in
relative motion with respect to said scene, said camera having an optical center,
the camera motion giving rise to optical flow between the images, the method
comprising the steps of:

(a) selecting for each image of said sequence at least one strip such that
each strip is substantially perpendicular to said optical flow; said strips having a
front edge and a back edge with the optical flow entering a strip through the front
edge and exiting the strip through the back edge; and

(b) pasting together said strips from adjacent images to construct a panoramic
mosaic.

By one embodiment the method further comprises the step of:
(a') warping the front edge of a strip defined on a two-dimensional image so
that it is substantially aligned with the back edge of a strip defined on an adjacent
two-dimensional image.

By another embodiment the strips are transformed by warping into 
strips having edges of arbitrary shape before the strips are pasted together. 

By yet another embodiment the strips are transformed by warping 
into strips having straight edges before the strips are combined together.

According to yet another embodiment the two-dimensional images
are related by an affine transformation or by a planar-projective transformation.

According to another embodiment said images are projected onto a
three-dimensional cylinder whose major axis approximates the path of the camera
centers of said images, and the combination of the strips is achieved by translating
the projected two-dimensional images substantially along the cylindrical surface of
the three-dimensional cylinder.

According to yet another embodiment every two subsequent
images define their own cylinder whose major axis substantially passes through
the centers of the cameras of said images, and the cylinders are concatenated
substantially along the image sequence.

According to still another embodiment a transformation is applied 
to the panoramic mosaic depending on a desired viewpoint. 

According to a further embodiment the sequence of
images is augmented by sets of interpolated images intermediate to the images of
the sequence of images, and the strips are augmented with strips defined
on the interpolated images.

According to another embodiment the system further combines a
sequence of two-dimensional images of a scene to obtain a panoramic mosaic of
said scene, said sequence of two-dimensional images being acquired by a moving
camera in relative motion with respect to said scene, said camera having an
optical center, the camera motion giving rise to optical flow between the images,
the system comprising:

a warper for warping the images; and
a paster for pasting the images into the panoramic image, such that the
optical flow becomes substantially parallel to the direction in which the mosaic is
constructed.

Still further, the invention provides for combining a sequence of
two-dimensional images of a scene to obtain a panoramic mosaic of said scene,
said sequence of two-dimensional images being acquired by a moving camera in
relative motion with respect to said scene, said camera having an optical center,
the camera motion giving rise to optical flow between the images, the system
comprising:

(a) a selector for selecting for each image of said sequence at least one strip
such that each strip is substantially perpendicular to said optical flow; said strips
having a front edge and a back edge with the optical flow entering a strip through
the front edge and exiting the strip through the back edge; and

(b) a paster for pasting together said strips from adjacent images in such a
way that the front edge of a strip defined on an image is substantially aligned with
the back edge of a strip defined on an adjacent image.

Still yet further the invention provides a memory containing a file 
representing a panoramic mosaic of a scene. 

The process described herein can alternatively be interpreted using 
three-dimensional projections of the images onto cylinders ("pipes") whose 
principal axis is the direction of camera motion. Such projections create warpings 
of the images such that the optical flow becomes parallel. 

BRIEF DESCRIPTION OF THE DRAWINGS 

For a better understanding the invention will now be described, by 
way of example only, with reference to the accompanying drawings in which: 

Fig. 1 shows the effects of large rotations on reference frame based 
methods; 

Fig. 2 shows aerial photography with a one-dimensional scan system; 

Fig. 3a shows manifold projection for a camera performing pure 
translation, the projection is a parallel projection onto a plane; 

Fig. 3b shows manifold projection for a camera performing pure rotation, 
the projection is onto a cylindrical manifold;

Fig. 3c shows manifold projection for a camera performing both
translation and rotation, the projection is onto a manifold not having a simple
geometrical form;

Fig. 4 shows a general flow chart of the principal steps of the panorama
production process of the invention;

Fig. 5 shows the effects of parallax on the alignment and merging 
processes; 

Fig. 6 shows a mosaic built from images taken by a camera in sideways
motion using vertical linear strips perpendicular to the camera's optical axis and to
the optical flow, which is from right to left as the camera translates from left to
right;

Fig. 7a shows a mosaic built from images taken by a camera in forward
motion with translation along the optical axis of the camera, and optionally with
zoom; the optical flow is radial from the center of the image to the outside, and
the strips are circular;

Fig. 7b shows the result of applying to an entire image the transformation 
that "bends" the strips; 

Fig. 8 shows a mosaic built from images taken by a camera in translation
from left to right along a line making an intermediate angle (between 0 and 90
degrees) with the optical axis; the optical flow is radial from the focus of
expansion which is located to the right of the image, and the strips are circular or
elliptic arcs;

Figs. 9a-9d show the shape of strips for different cases of affine motion:
Fig. 9a a straight vertical strip for horizontal motion;
Fig. 9b a straight horizontal strip for vertical motion;
Fig. 9c a circular strip for forward motion;
Fig. 9d an elliptical strip for general motion;

Figs. 10a-10d show an example of cutting and pasting strips for the case of
affine motion:
Figs. 10a-c show strips that are perpendicular to the optical flow. Line
F2 is selected in Image I2 and Line F3 is selected in Image I3. The mapping of
Line F3 (in I3) into Image I2 using the same affine transformation is Line F3'.
The strip S2 taken from Image I2 is bounded between lines F2 and F3';
Fig. 10d shows strips that are warped and pasted so that the optical flow
becomes parallel; their back is fixed (e.g. F2 in strip S2) and their front (e.g. F3'
in strip S2) is warped to match the back of the next strip;

Fig. 11 shows the projection of an image onto a pipe in order to achieve
parallel optical flow;

Fig. 12a shows the selection of strips from different images according to
the resolution obtained from each image when projecting the images onto a pipe;
Fig. 12b shows the concatenation of pipes in the case of a complex camera
path;

Fig. 13 shows the choice of strip width required to preserve the original
resolution for the case of pure rotation;

Fig. 14 shows the choice of strip width required to preserve the original
resolution for the case of pure translation;

Fig. 15 shows the generation of a panoramic image using view
interpolation by generating synthetic views from intermediate camera positions
and by taking narrow strips from each intermediate view to construct the mosaic,
for either the case of translation, P1, or the case of rotation, P2; and

Fig. 16 shows the generation of consistent panoramic mosaics in the
presence of parallax.

DETAILED DESCRIPTION OF THE INVENTION

Attention is first drawn to Fig. 4 showing a general flow chart of
the principal steps of the panorama production process of the invention. Motion
recovery is performed at step 401. This step can use the images as well as any
external motion information. New view synthesis is performed at step 402. This
step can use the input images and the motion information. Determining the strip size
is performed at step 403. This step can also use the motion information.
Determining the strip shape is performed at step 404. The cut and paste process
is performed at step 405. This step can use the input images, the synthetic images
for the intermediate views, the motion information, the strip size, and the strip
shape. The result of this process is a realistic panorama. Steps 402, 403, and 404
are optional, and incorporating any of these steps is a process covered by this
invention.
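The data flow between steps 401 to 405 can be summarized in a short Python skeleton. This is only an illustrative sketch: the function `build_panorama` and its callable parameters are hypothetical names, and the actual work of each step is supplied by the caller. Steps 402, 403 and 404 are optional here, as in the flow chart.

```python
def build_panorama(images, recover_motion, cut_and_paste,
                   synthesize_views=None, strip_size=None, strip_shape=None):
    """Run the steps of Fig. 4 in order, skipping the optional ones.

    All arguments except `images` are caller-supplied callables standing in
    for the corresponding step; none of them is defined by the specification.
    """
    motion = recover_motion(images)                    # step 401: motion recovery
    views = []                                         # step 402: new view synthesis
    if synthesize_views is not None:
        views = synthesize_views(images, motion)
    size = None                                        # step 403: strip size
    if strip_size is not None:
        size = strip_size(motion)
    shape = None                                       # step 404: strip shape
    if strip_shape is not None:
        shape = strip_shape(motion)
    # Step 405: cut and paste, using whatever the earlier steps produced.
    return cut_and_paste(images, views, motion, size, shape)
```

Passing the steps as callables keeps the skeleton independent of any particular motion model or strip geometry.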

A detailed example of the method of the present invention will be
given and applied to the very common case, in which the motion between every
two successive images can be modeled as a two-dimensional affine motion. This
covers most simple scenarios, and also zoom and forward motion in cases of a
planar scene parallel to the image plane. Generated mosaics have minimal
distortions compared to the original images, as no global scaling is performed.

A possible geometric interpretation of the method of the invention
will be given for general camera translation. This is done using a projection we
call Pipe Projection. This Pipe Projection can be used as an implementation of the
proposed method when three-dimensional camera motion can be recovered. This
interpretation of the method demonstrates the way strips can be collected and
transformed, in such a way that complicated cases of oblique view can still be
handled well by the proposed method.

The suggested three-dimensional interpretation of the method is
that images in a video sequence are transformed by an oblique projection of the
image onto a viewing pipe whose central axis is defined by the trajectory of the
camera. After this transformation the optical flow between frames becomes
parallel, and the frames can be easily mosaiced along the viewing pipe, using
simple cut and paste. The pipe mosaic generated this way includes most of the
details observed by the moving camera, where each region is taken from that
image where it was captured at highest resolution. Viewing this pipe mosaic from
various directions can give results equivalent to the various mosaics achieved
using the two-dimensional implementation.

SHAPE OF STRIPS 

With no parallax and with pure image translation the construction
of the panorama from the images is simple. Since over an overlap area between
two images the alignment is very good, any selection of the particular image that
will cover any given region is usually not critical. In cases without parallax,
the shape of the strip becomes important mostly when there is image magnification,
as in the case of zoom. But with image parallax, alignment over an overlap area between
images will not be perfect, and the selection of which image will cover an area in
the panorama becomes critical.

Fig. 5 shows the effects of parallax on the alignment and merging
processes. Objects A, B, C, D, E are located on a planar surface at the top of the
figure. Objects C, X, and Y induce parallax in the two input images I1 and I2
taken by a translating camera at C1 and C2. Any of objects C, X, or Y can be used
as the aligned region, thus giving three different ways to create panoramic images,
shown as P1, P2 and P3 at the bottom of the figure.

The mosaicing process can be presented as cutting "strips" from
each image, and pasting those strips to a larger panorama. It will be shown that
the type of camera motion determines the shape of these strips. This is in contrast
to prior suggestions to use a "Voronoi Tessellation" to create the panoramic
mosaic from the images, a suggestion that does not take into account at all the
three-dimensional camera motion, but only the two-dimensional image
displacement of the image centers.

For example, better mosaicing will result if the boundaries of the
strip are taken to be approximately perpendicular to the "optical flow" (local
image displacement) generated by the camera motion. Examples are camera
translations: sideways motion, forward motion, and a general translation; as well
as camera zoom.

In sideways motion, as shown in Fig. 6, the strips can be linear
strips at the center of the images. Given three input frames, 601, 602, and 603,
panoramic mosaic 604 is generated by taking strip S1 from image 601, strip S2
from image 602, and strip S3 from image 603. The images are aligned so that
region S1 in image 601 matches region S1 in image 602, region S2 in image 602
matches regions S2 in images 601 and 603, and region S3 in image 603 matches
region S3 in image 602.

In the cases of forward motion and of zoom the strips cannot be
bounded by straight lines. In these cases the strips are preferably circular, centered
at the focus of expansion of the image. In the example shown in Fig. 7a the focus
of expansion is located at the center of the image. Panoramic mosaic 704 will be
created by "unfolding" strip S1 from image 701, strip S2 from image 702, and
strip S3 from image 703, and placing them adjacent to each other.

When the strips are wide, "unfolding" them (by warping) will
create a non-rectangular strip. Also, strips will not be aligned due to scale
difference across seams. In this case each strip can be rescaled to a rectangular
strip, thus giving the continuous panoramic mosaic 705 from panoramic
mosaic 704. Such rescaling will improve alignment across seams. The place
where the circle is "opened" before its unfolding is arbitrary, and possibly
determined by the direction in which the panoramic mosaic is constructed. The
constructed mosaic image can be considered as the surface area of a cylinder as
will be described in greater detail below with reference to the three-dimensional
interpretation of the method.
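One simple way to picture the "unfolding" and rescaling of a circular strip is polar resampling, sketched below in Python. This is an illustrative assumption, not the warping prescribed by the specification: the focus of expansion `(cx, cy)` is taken as known, the circle is "opened" at angle zero, every radius is sampled at the same number of angles so that the result is rectangular, and nearest-neighbor sampling is used. The function name `unfold_ring` is hypothetical.

```python
import math


def unfold_ring(image, cx, cy, r_inner, r_outer, n_angles):
    """Unfold the ring r_inner <= r < r_outer around (cx, cy) into a rectangle.

    image: list of pixel rows. Output row i holds radius r_inner + i and
    output column k holds angle 2*pi*k/n_angles. Sampling every radius at
    the same n_angles rescales the shorter inner arcs to the same width as
    the outer ones. Samples falling outside the image are returned as None.
    """
    height = len(image)
    width = len(image[0])
    strip = []
    for r in range(r_inner, r_outer):
        row = []
        for k in range(n_angles):
            theta = 2.0 * math.pi * k / n_angles
            x = int(round(cx + r * math.cos(theta)))   # nearest-neighbor sample
            y = int(round(cy + r * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            row.append(image[y][x] if inside else None)
        strip.append(row)
    return strip
```

In this sketch, radial image motion away from `(cx, cy)` becomes motion along the rows of the output, so rectangles unfolded from consecutive frames can be stacked one after another.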

It should be noted that the "unfolding" of the circular strips into
straight strips might cause mosaic 705 to look distorted. It is expected that only
sub-parts will be used from such mosaics, for example the part that relates to the
top of the image or the part that relates to the left side of the image, etc. Such a
part will usually be a rectangular sub-strip of mosaic 705. Before such a part is
displayed the mosaic can be rectified by "bending" its straight sides into arcs of a
circle whose radius, for example, can be the outside radius of the original circular
strip (e.g. strip S3 in image 703).

The transformation that mapped strip S1 in image 701 into strip S1
in mosaic 705 turns radial optical flow, in image 701, into parallel optical flow, in
mosaic 705. If the same transformation is applied to the entire image 701, instead
of just to strip S1, the transformed image will have the shape shown in Fig. 7b.
As will be described in greater detail below, such transformations can be modeled
by the three-dimensional interpretation using the projection onto a cylinder.

The case of camera zoom is of special interest. Whilst zooming
towards a distant scene and mosaicing as in Fig. 7a will create a mosaic image
with higher resolution in locations relating to the center of the image, the case of a
camera viewing objects from the side is different. Assume the camera is located
at the side of a very long wall, with the optical axis parallel to the wall. In this
case the closest parts of the wall are seen in great detail at the edge of the image,
while the distant parts of the wall appear smaller and closer to the center of the
image. When zooming in, the further parts are magnified and get closer to the
edge of the image, and the mosaic will therefore become a reconstruction of the
wall at the highest possible resolution. Under some conditions the wall can even
be reconstructed as viewed from the front, with uniform resolution all over.

In a more general case of camera translation, shown in Fig. 8, there
can be any angle between the direction of camera motion and the optical axis. In
this example the optical flow can be radial from the focus of expansion, which is
located somewhere outside the image, and therefore the preferred shape for the
strip is a circular or an elliptic arc. Given three input frames, 801, 802, and 803,
panoramic mosaic 804 is generated by taking strip S1 from image 801, strip S2
from image 802, and strip S3 from image 803. The images are aligned so that
strip S1 in image 801 matches strip S1 in image 802, strip S2 in image 802
matches strips S2 in images 801 and 803, and strip S3 in image 803 matches strip
S3 in image 802.

The strips in the input images, like strip S1 in image 801 and strip
S2 in image 802, are bounded by arcs of concentric circles centered at the focus
of expansion. Since the radii of the two circles are different, their curvatures are
different, and the strips cannot be pasted together without gaps forming between
the edges of the strips. In order to paste the strips without the formation of
gaps, the strips are warped before pasting.

Strip 810 displays an example of an original circular strip as cut
from an input image. The radius r1 of left arc 811 is larger than the radius r2 of
right arc 812, which is closer to the focus of expansion. Strip 810 can be warped
to yield strip 820, which has the following properties: arc 821 and arc 822 are
both of radius r1; the length of arc 821 is the same as the length of arc 811; the
length of arc 822 is the length of arc 812 multiplied by r1/r2. This arrangement
assures not only that the strips will fit without gaps, but also that features of the
image will be resized properly for better alignment across the seam.
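
The arithmetic behind the warp of strip 810 into strip 820 can be checked with a
short sketch. It assumes, as an illustration only, that arcs 811 and 812 subtend
the same angle theta at the focus of expansion (they are concentric arcs cut by the
same strip); the function name and the sample values are hypothetical.

```python
import math

def warped_arc_lengths(r1, r2, theta):
    """Return (length of arc 821, length of arc 822) after warping.

    r1, r2: radii of arcs 811 and 812 (r1 > r2).
    theta:  angle in radians that both arcs are assumed to subtend
            at the focus of expansion (an assumption of this sketch).
    """
    len_811 = r1 * theta          # outer arc of the original strip
    len_812 = r2 * theta          # inner arc, closer to the focus of expansion
    len_821 = len_811             # arc 821 keeps the length of arc 811
    len_822 = len_812 * r1 / r2   # arc 822 is arc 812 stretched by r1/r2
    return len_821, len_822

outer, inner = warped_arc_lengths(200.0, 150.0, 0.5)
```

Under this assumption both warped arcs have radius r1 and equal length, so the
two sides of strip 820 are congruent and neighbouring strips can abut without gaps.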

Even though the above discussion on the shape of the strip assumes
a uniform camera motion along a sequence, camera motion can change, affecting
the shape of the strip. Assume, for example, a forward motion between frame I1
and frame I2, and a sideways motion between frame I2 and frame I3. The strip
taken from frame I2 can have a circular arc boundary on the side of frame I1 and a
straight line boundary on the side of frame I3.



EXAMPLE: MOSAICING FOR AFFINE MOTION 

An example of strip shaping for the special case of affine motion
will now be described. Affine motion is based on an affine transformation and
affords a good approximation for many types of motion. Based on the detailed
description given below it will be apparent to a person skilled in the art that other
types of motion can be dealt with in a similar manner.

The affine transformation can be expressed as follows:

\[
\begin{pmatrix} u \\ v \end{pmatrix}
= \begin{pmatrix} x_n - x_{n-1} \\ y_n - y_{n-1} \end{pmatrix}
= \begin{pmatrix} a + b x_n + c y_n \\ d + e x_n + f y_n \end{pmatrix}
\tag{1}
\]

where P_{n-1} = (x_{n-1}, y_{n-1}) and P_n = (x_n, y_n) are the coordinates of
corresponding points in images I_{n-1} and I_n, and the parameters of the affine
transformation A are (a, b, c, d, e, f). (u, v) is the optical flow vector as a
function of position (x_n, y_n). The transformation A (and the optical flow) vary
continuously along the sequence of images. Numerous methods exist to recover the
parameters of an affine transformation [21, 18] and they will not be described here.

In accordance with the method of the present invention, in order to
define the shape of a strip, it is required to find a line F(x, y) = 0 which is
perpendicular to the optical flow. It should be noted that this line is not
necessarily a straight line, and can be a curved line. The normal to the line F = 0
is in the direction (∂F/∂x, ∂F/∂y), and thus should be in the same direction as
(u, v). This constraint can be expressed as follows:

\[
\begin{pmatrix} \partial F / \partial x \\ \partial F / \partial y \end{pmatrix}
= k \begin{pmatrix} u \\ v \end{pmatrix}
= k \begin{pmatrix} a + b x + c y \\ d + e x + f y \end{pmatrix}
\tag{2}
\]



for some value of k. By integrating, when e = c we get the line equation: 

\[
0 = F(x, y) = a x + d y + \frac{b}{2} x^2 + \frac{c + e}{2} x y + \frac{f}{2} y^2 + M
\tag{3}
\]
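
The relation between the line equation and the optical flow can be checked
numerically. The following is a minimal sketch, not part of the original
disclosure: it evaluates F for a set of affine parameters with e = c (taking
k = 1) and compares a finite-difference gradient of F with the flow (u, v). The
function names and the sample parameter values are hypothetical.

```python
def flow(A, x, y):
    """Optical flow (u, v) of the affine model at position (x, y)."""
    a, b, c, d, e, f = A
    return a + b * x + c * y, d + e * x + f * y

def F(A, x, y, M=0.0):
    """Left-hand side of the line equation for affine parameters A."""
    a, b, c, d, e, f = A
    return (a * x + d * y + 0.5 * b * x * x
            + 0.5 * (c + e) * x * y + 0.5 * f * y * y + M)

def grad_F(A, x, y, h=1e-4):
    """Central-difference gradient of F; exact for a quadratic up to rounding."""
    gx = (F(A, x + h, y) - F(A, x - h, y)) / (2 * h)
    gy = (F(A, x, y + h) - F(A, x, y - h)) / (2 * h)
    return gx, gy

A = (1.0, 0.02, 0.01, -0.5, 0.01, 0.03)   # sample values with e == c
u, v = flow(A, 12.0, -7.0)
gx, gy = grad_F(A, 12.0, -7.0)
```

With e = c the gradient of F coincides with (u, v) at every point, so the curve
F = 0 crosses the flow lines at right angles.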



Note that this line equation exists only when e = c. In most cases,
the difference between the values of c and e is due to the rotation of the image
around the optical axis by ω (angle in radians), such that it contributes -ω to
c, and +ω to e. To approximately satisfy the condition e ≈ c, it is therefore
possible to rotate the image about its center by ω ≈ (e - c)/2 after the affine
transformation is recovered, and then recompute the affine transformation.
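
The rotation estimate can be illustrated with a short sketch. Note that the text
rotates the image and re-estimates the affine transformation; the sketch below
instead uses a first-order shortcut (an assumption of this example, not the
described procedure) that subtracts the rotation's contribution directly from c
and e. The function name is hypothetical.

```python
import math

def split_rotation(A):
    """Estimate the rotation about the optical axis and remove it to first order.

    Returns (omega, A_without_rotation), where omega = (e - c) / 2 and the
    returned parameters satisfy e == c. This direct parameter adjustment is a
    simplification; re-estimating the motion after rotating the image, as
    described in the text, is the more faithful route.
    """
    a, b, c, d, e, f = A
    omega = (e - c) / 2.0
    return omega, (a, b, c + omega, d, e - omega, f)

omega, A_sym = split_rotation((0.0, 0.1, -0.05, 0.0, 0.07, 0.1))
```

After the adjustment both off-diagonal terms equal (c + e)/2, so the condition
required by Equation 3 holds.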

As a result, Equation 3 defines a family of lines that are all
perpendicular to the optical flow. M is used to select a specific line. It is
suggested that M be set equal to the value for which the line contains a maximum
number of pixels within the image. If many options exist, then it is suggested that
a line be selected as close as possible to the center of the image so as to minimize
lens distortions. This selection should ensure that pixels used in the mosaic will
be from that image having the best resolution at that location.
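
One possible way to carry out this selection of M is sketched below. It is an
illustration under stated assumptions rather than the disclosed implementation:
image coordinates are taken relative to the image center, a pixel is counted as
lying on the line when it is within about half a pixel of F = 0, candidate values
of M are sampled uniformly, and ties are broken by the distance of the line from
the center. The function name `select_M` and the parameter `n_candidates` are
hypothetical.

```python
import numpy as np

def select_M(A, width, height, n_candidates=201):
    """Pick M so that the line F = 0 covers the most pixels inside the image."""
    a, b, c, d, e, f = A
    ys, xs = np.mgrid[0:height, 0:width].astype(float)
    xs -= (width - 1) / 2.0    # coordinates relative to the image center
    ys -= (height - 1) / 2.0
    # F with M = 0, and the gradient magnitude of F (valid when e == c).
    F0 = a * xs + d * ys + 0.5 * b * xs**2 + 0.5 * (c + e) * xs * ys + 0.5 * f * ys**2
    gmag = np.hypot(a + b * xs + c * ys, d + e * xs + f * ys)
    best = None
    for M in np.linspace(-F0.max(), -F0.min(), n_candidates):
        # |F| / |grad F| approximates the distance to the line in pixels.
        on_line = np.abs(F0 + M) < 0.5 * gmag
        count = int(on_line.sum())
        if count == 0:
            continue
        nearest = float(np.hypot(xs[on_line], ys[on_line]).min())
        key = (count, -nearest)   # more pixels first, then closer to center
        if best is None or key > best[0]:
            best = (key, float(M))
    return best[1]

M = select_M((1.0, 0.0, 0.0, 0.0, 0.0, 0.0), 11, 5)
```

For the uniform horizontal flow used here every vertical line covers the same
number of pixels, so the tie-break selects a line through the central column.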

Equation 3 can be easily understood for some simple cases:

(i) In the case of a uniform horizontal optical flow (either a small pan
or a sideways translation of the camera), the affine transformation A takes
the form A = (a, 0, 0, 0, 0, 0), thus the selected line 901 becomes
0 = F(x, y) = ax + M, which is a straight vertical line (see Fig. 9a).

(ii) In the case of a uniform vertical optical flow (either a small tilt or a
vertical translation of the camera), the affine transformation takes the form
A = (0, 0, 0, d, 0, 0), thus the selected line 902 becomes
0 = F(x, y) = dy + M, which is a straight horizontal line (see Fig. 9b).

(iii) In the case of zoom or forward motion (towards a planar surface
which is parallel to the image plane), the affine transformation takes the
form A = (0, b, 0, 0, 0, f), where b is a scaling factor (f = b). As a result,
the selected line 903 becomes 0 = F(x, y) = (b/2)(x^2 + y^2) + M, which is a
circle around the center of the image 904 (see Fig. 9c).

For general translations of the camera the line will be a circle
around the focus of expansion. In more general cases the line may be
approximated by an elliptic curve 905 (taking e ≈ c):

\[
0 \approx F(x, y) = a x + d y + \frac{b}{2} x^2 + \frac{f}{2} y^2 + c\,x y + M
\]

(see Fig. 9d).

The mosaic is constructed by pasting together strips taken from the
original images. The shape of the strip, and its width, depend on the image
motion. An example will now be described of how to determine these strips in the
case of an affine motion to conform to the methodology of the selection of best
resolution. Strip selection for other types of image motion can be performed in a
similar manner.

The following notation will be used to describe the strip collection
along the sequence of images: the line F_n(x_n, y_n) = 0 is the line in image I_n,
in its coordinate system, which is perpendicular to the optical flow described by
the affine transformation A_n = (a_n, b_n, c_n, d_n, e_n, f_n). This affine
transformation A_n relates points p_n in image I_n to corresponding points p_{n-1}
in image I_{n-1}.

In order to determine the strip to be taken from image I_n, the
preceding frame I_{n-1} and the succeeding frame I_{n+1} should be considered.
Let A_n be the affine transformation relating points p_n = (x_n, y_n) in image I_n
to the corresponding points p_{n-1} = (x_{n-1}, y_{n-1}) in image I_{n-1}, and let
A_{n+1} be the affine transformation relating points p_{n+1} = (x_{n+1}, y_{n+1})
in image I_{n+1} to the corresponding points p_n = (x_n, y_n) in image I_n.

Given the affine transformations A_n and A_{n+1}, the lines
F_n(x_n, y_n) = 0 and F_{n+1}(x_{n+1}, y_{n+1}) = 0 are selected respectively
(see Figs. 10a to 10c). The line F_n(x_n, y_n) = 0 in I_n corresponds to the line
F'_n(x_{n-1}, y_{n-1}) = 0 in I_{n-1} using the affine transformation A_n. In the
same way, the line F_{n+1}(x_{n+1}, y_{n+1}) = 0 in I_{n+1} corresponds to the
line F'_{n+1}(x_n, y_n) = 0 in I_n using the affine transformation A_{n+1}.

The strip that is taken from the image I_n is bounded between the
two lines F_n(x_n, y_n) = 0 and F'_{n+1}(x_n, y_n) = 0 in I_n (see Figs. 10a to
10c). For example, in Fig. 10b, line F2 is selected in image I2, and in Fig. 10c
line F3 is selected in image I3. The mapping of line F3 (in I3) into image I2 using
the affine transformation is line F3'. Hence the strip S2 taken from image I2 in
Fig. 10b is bounded between lines F2 and F3'. It should be noted that the strips
S1, S2, and S3 are perpendicular to the lines of optical flow 1001.

Using this selection, the first boundary of the strip will be defined
by the selected line F_n, and thus will be exactly orthogonal to the optical flow
with regard to the previous image. The second boundary of the strip is defined by
the line F'_{n+1}, which is the projection of the line F_{n+1} onto the current
image I_n, having the same property in the next image.

This selection of the boundaries of the strip ensures that no
information is missed or duplicated along the strip collection, as the
orthogonality to the optical flow is retained.
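
The mapping of the line selected in the next image back into the current image
can be sketched as follows. This is a minimal illustration, assuming the flow
convention of Equation 1 (a point of the later image minus its flow gives the
corresponding point of the earlier image); the function name and the sample
motion of 10 pixels are hypothetical.

```python
def map_to_previous(A, x, y):
    """Map a point of image I_{n+1} to the corresponding point of image I_n.

    A holds the affine parameters (a, b, c, d, e, f) relating I_{n+1} to I_n,
    so the flow (u, v) at (x, y) is subtracted from the point.
    """
    a, b, c, d, e, f = A
    u = a + b * x + c * y
    v = d + e * x + f * y
    return x - u, y - v

# Uniform horizontal flow of 10 pixels between I_n and I_{n+1}.
A_next = (10.0, 0.0, 0.0, 0.0, 0.0, 0.0)
# Samples of the selected line in I_{n+1}: the vertical line x = 0.
line_next = [(0.0, float(y)) for y in range(-2, 3)]
# The same line expressed in I_n: the front boundary of the strip.
line_in_current = [map_to_previous(A_next, x, y) for x, y in line_next]
```

In this example the back boundary in the current image is the line x = 0 and the
mapped front boundary is the line x = -10, so the strip is 10 pixels wide, equal
to the image displacement.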

Consider the common approach to mosaicing where one of the
frames is used as a reference frame, and all other frames are aligned to the
reference frame before pasting. In terms of strips, the first strip is put in the
panoramic image as is. The second strip is warped in order to match the
boundaries of the first strip. The third strip is now warped to match the
boundaries of the already warped second strip, etc. As a result, the mosaic image
is continuous. However, major distortions may be caused by the accumulated
warps and distortions. Large rotations cannot be handled, and cases such as
forward motion or zoom usually cause unreasonable expansion (or shrinking) of
the image.

To create continuous mosaic images while avoiding accumulated 
distortions, it is proposed by this invention that the warping of the strips should 
depend only on the adjacent original frames, independent of the history of
previous distortions.

In accordance with the present invention, it is preferable that one
side of each strip, e.g. the back side, is not warped. This is the side of the
strip that corresponds to the boundary between image I_{n-1} and image I_n and is
defined by F_n. For example, in Fig. 10b, line F2 is the back of strip S2 of image
I2. The front of the strip is warped to match the back side of the next strip. This
is the boundary between image I_n and image I_{n+1}, which is defined by F'_{n+1}.
For example, in Fig. 10b, the line F3' is the front of strip S2 of image I2.

In the example described in Fig. 10d, the first strip S1 is warped
such that its left side (i.e., back side) 1002 does not change, while its right side
(i.e., front side) 1003 is warped to match the left side of the original second strip
S2. In the second strip S2, the left side 1004 does not change, while the right side
1005 is warped to match the left side 1006 of the third strip S3, etc.

As a result, the constructed image is continuous. Also, if the
original optical flow is warped by the same warping as that performed on the
strips, the resulting flow will become approximately parallel to the direction in
which the panoramic mosaic is constructed. Moreover, no accumulative
distortions are encountered, as each strip is warped to match just another original
strip, avoiding accumulative warps.

POSSIBLE THREE-DIMENSIONAL INTERPRETATION OF STRIP 
SHAPING 

In general camera motion, the optical flow is induced by camera 
translation and by camera rotation. The rotational part can be recovered and
compensated for if needed, as it does not depend on the structure of the scene
(see, for example, [17]). Camera translation (and zoom) induces radial optical 
flow emerging from the focus of expansion, except for the singular case of 
sideways translation in which the optical flow is parallel. 

Cases of radial optical flow are much more complicated for
mosaicing since the optical flow is not parallel, and depends on the structure of
the scene. 

In accordance with the present invention, an example of a possible 
three-dimensional interpretation of the proposed mosaicing method is presented. 
It is also possible to use the following description to implement the mosaicing
process proposed in this invention for cases in which the three-dimensional
motion information is available, either from the images [21, 17] or from external
devices. The procedure of choosing curved strips which are approximately
perpendicular to the optical flow and warping them to match each other when
pasting, can be considered as transforming the video sequence of images by an
oblique projection of the image onto a viewing pipe whose central axis is defined
by the trajectory of the camera. After this transformation the optical flow
between the projected images becomes approximately parallel to the central axis
of the pipe, and they can be easily mosaiced using simple (traditional) strip cut
and paste procedures along the pipe. The pipe mosaic generated this way
includes most details observed by the moving camera, where each region is taken 
from that image where it was captured at highest resolution, thus forming a strip 
in that image. 

In order to define the projection onto the pipe, the following
notation will be used: the letter O will be used to refer to the origin of two
Cartesian coordinate systems having a joint origin. One coordinate system is a
global coordinate system with axes denoted by X, Y, Z. The camera is located at
the origin, and the image plane is located at Z = f_c, where f_c is the focal length.
The other coordinate system defines the pipe, and will be described below. The
position of a point P in three-dimensional space is given by its coordinates in
either of the coordinate systems, for example P = (P_x, P_y, P_z) in the X, Y, Z
coordinate system. The vector OP will also be denoted by the letter P.

Given a sequence of images taken by a translating camera, the
method of the invention suggests that the images be transformed in such a way
that the radial optical flow be turned into approximately parallel optical flow in
the transformed representation. In order to achieve the required transformation
the two-dimensional planar image is projected onto a three-dimensional cylinder,
referred to herein as a "pipe" 1101 (see Fig. 11). The axis of the pipe 1102 is
chosen to pass through the optical center O = (0, 0, 0) of the camera and through
the focus of expansion S = (s_x, s_y, f_c), where f_c is the focal length. This axis
is the trajectory from the current three-dimensional camera position towards the
three-dimensional camera position in the next frame. The direction of the pipe's
axis is given by the unit vector s = S / |S|. Each image point P = (x, y, f_c), in
image plane 1103, is projected onto its corresponding point Q on the pipe. The
point Q is collinear with O and P, and its distance from the pipe's axis 1102 is R
(the radius of the pipe).

In the pipe representation of the image, the optical flow of each 
corresponding point Q on the pipe is now approximately parallel to the direction
of the pipe's axis s (1102). This enables a simple mosaicing process on the pipe
itself, as subsequent images, after being projected on the pipe, need only be 
shifted along the pipe in order to become aligned with previous images. This 
translation along the pipe does not reduce the resolution, as commonly happens in 
mosaicing methods which are based on alignment with a reference frame. 

A pipe-fixed Cartesian coordinate system is defined by the three
unit vectors r, d and s, where s is the unit vector in the direction of the pipe's
axis and r and d are chosen to be perpendicular to each other and to s.

Let the point L be the projection of the point Q on the axis 1102 of
pipe 1101 and let k be the distance of L from O. The angle α designates the angle
between the line joining L and Q and the unit vector d. Hence k and α determine
the position of a point Q on pipe 1101. The three-dimensional position of a point
Q on the pipe 1101 is given by the Cartesian components (Q_x, Q_y, Q_z), which can
be obtained from the components of the vector Q = k s + R cos(α) d + R sin(α) r
with respect to the pipe-fixed system. The corresponding pixel in image plane
1103 for the point Q is P = (x, y, f_c) = (f_c Q_x / Q_z, f_c Q_y / Q_z, f_c).
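
The projection between the image plane and the pipe can be sketched as follows.
This is an illustrative example using NumPy, not the disclosed implementation.
The function names, the helper vector used to pick d, and the sample values are
assumptions; the text only requires r and d to be perpendicular to each other and
to s.

```python
import numpy as np

def pipe_frame(S):
    """Unit vectors (r, d, s) of a pipe-fixed system for axis direction S."""
    s = np.asarray(S, dtype=float)
    s = s / np.linalg.norm(s)
    # Any helper vector not parallel to s works; this choice is arbitrary.
    helper = np.array([1.0, 0.0, 0.0]) if abs(s[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    d = np.cross(s, helper)
    d = d / np.linalg.norm(d)
    r = np.cross(s, d)
    return r, d, s

def image_to_pipe(P, s, R):
    """Point Q on the ray O-P whose distance from the pipe axis equals R."""
    P = np.asarray(P, dtype=float)
    radial = P - np.dot(P, s) * s          # component of P perpendicular to the axis
    return P * (R / np.linalg.norm(radial))

def pipe_to_image(k, alpha, frame, R, fc):
    """Image point P = (x, y, fc) for pipe coordinates (k, alpha)."""
    r, d, s = frame
    Q = k * s + R * np.cos(alpha) * d + R * np.sin(alpha) * r
    return np.array([fc * Q[0] / Q[2], fc * Q[1] / Q[2], fc])

fc, R = 500.0, 500.0
frame = pipe_frame([100.0, 0.0, fc])       # S = (s_x, s_y, f_c)
r, d, s = frame
P = np.array([30.0, -20.0, fc])
Q = image_to_pipe(P, s, R)
k = np.dot(Q, s)                            # position along the axis
alpha = np.arctan2(np.dot(Q, r), np.dot(Q, d))
P_back = pipe_to_image(k, alpha, frame, R, fc)
```

Projecting a pixel onto the pipe and back returns the original pixel, which
confirms that (k, α) is a consistent parameterization of the pipe surface.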

Pixels in the image plane 1103 whose original distance from the
axis 1102 is less than R become magnified on the pipe, but when projected back
to the image they restore their resolution. However, pixels with distance greater
than R shrink on the pipe, thus losing their original resolution. For this reason, it
is recommended to choose R to be equal to

\[
R = \sqrt{f_c^2 + \left(\frac{w}{2}\right)^2 + \left(\frac{h}{2}\right)^2}
\]

where w and h are the width and height of the image, thus ensuring that no pixel
will have reduced resolution when projected onto the pipe. Alternatively, in many
simple scenarios it is enough to choose R to be equal to f_c.
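
As a small worked example of the recommended radius, the sketch below computes
R from the focal length and the image size. The function name and the sample
values are hypothetical. The value equals the distance from the optical center to
an image corner, which is an upper bound on the distance of any pixel from the
pipe's axis.

```python
import math

def pipe_radius(fc, width, height):
    """Recommended pipe radius for focal length fc and image size width x height."""
    return math.sqrt(fc**2 + (width / 2.0)**2 + (height / 2.0)**2)

R = pipe_radius(500.0, 640.0, 480.0)
```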



In the pipe representation, pipe images are aligned with each other
by a simple translation (shift) along the pipe's principal axis, and the creation of
the pipe mosaic involves taking the pixels with the best resolution among all
projected images for every point on the pipe. It should be noted that other
approaches to select the value for each point on the pipe could be used, including
super resolution methods. The resolution is best preserved for pixels whose area
when projected on the pipe is 1 by 1 pixels (meaning a single pixel is projected
onto a single pixel on the pipe, without artificial scaling). Using this criterion,
the ratio between the area consumed on the pipe and the area on the original image
frame can be considered as a measure such that the resolution is preserved best
when this ratio is as close as possible to 1. As a result, for each point on the pipe,
its corresponding pixels in the images are considered, and the one with the ratio
closest to 1 may be chosen for best resolution. As a rule of thumb, this ratio can
be roughly approximated according to the ratio of the distances along the Z axis,
Q_z / f_c, which should be as close as possible to 1. Using this approximated
measure, pixels on the image at the intersection of the pipe with the image
(Q_z = f_c) are considered as best preserving the resolution, and the resolution
preservation decreases according to |Q_z - f_c|. For every point on the pipe the
image values (e.g. color and intensity) will be taken from the image in which the
value of |Q_z - f_c| is minimal, thus having best resolution preservation. This
definition forms a strip in every image, which is the region in which this image
best maintains the resolution when projected on the pipe, compared to the
corresponding regions in other images (See Fig. 12a).

This pipe representation also proposes a generalized interpretation
for the traditional mosaicing methods. Methods based on alignment to a
reference frame can be simulated by viewing the pipe from the same orientation
as the selected reference frame. Methods which are limited to pure sideways
translation will give results identical to those of a pipe mosaic, where the images
are projected on the side of the pipe.
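
The rule of thumb for best resolution described above (taking each pipe point
from the image in which |Q_z - f_c| is minimal) can be sketched as follows. This
is an illustration only: the function name and the input layout are hypothetical,
and it assumes that the Z coordinate of every pipe point has already been
computed in the camera frame of each image. Points not visible in an image would
need to be masked, for example by setting their entry to infinity.

```python
import numpy as np

def best_source_image(Qz_per_image, fc):
    """Index of the image that best preserves resolution at each pipe point.

    Qz_per_image: array of shape (n_images, n_points) holding the Z coordinate
    of each pipe point in the camera frame of each image.
    """
    Qz = np.asarray(Qz_per_image, dtype=float)
    return np.argmin(np.abs(Qz - fc), axis=0)

Qz = [[480.0, 510.0, 700.0],    # image 0
      [505.0, 560.0, 520.0]]    # image 1
choice = best_source_image(Qz, 500.0)
```

Contiguous runs of pipe points that select the same image correspond to the
strip taken from that image.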

Cases like oblique view, forward motion, and zoom, can be well
defined using the pipe projection, and give optimal results, while previous
mosaicing methods may fail in these cases. The mosaicing process covered by
this invention uses generalized strips (having their shape, size, and warping
process determined according to the motion, and resolution considerations), and
may be interpreted by the above description of pipe mosaicing, thus generalizing
the known methods to work for the problematic cases as well.

The pipe representation can be generalized for handling complicated trajectories
and rotations by concatenation of pipes along the path of the camera (See
Fig. 12b).

STRIP WIDTH IN THREE-DIMENSIONAL REPRESENTATION 

When the three-dimensional camera motion T = (T_X, T_Y, T_Z) and
Ω = (Ω_X, Ω_Y, Ω_Z) (translation and rotation) is available from external devices,
or by using algorithms for camera motion recovery from the images [21, 17], then
either of these could be used for setting the size of the strips.

Following the description of the mosaicing process using the
"pipe", the projections of two images onto the pipe can be aligned with each other
by simple shift along the pipe's axis. Shifting the projected image by L pixels can
form a strip with a width of L pixels. A method to approximate the width, L, of a
strip for two input frames will now be described.

Note that it is assumed in this section that the pipe's radius is
chosen to be R = f_c, although other values of R are possible, and the value of L
may be scaled accordingly.

It is required to compute the width of the strip, L, in such a way that
the resolution of the resulting panoramic image will be no less than the resolution
of the original sequence of images. For example, without parallax, the width of 
the strip can be equal to image displacement between the two frames. 

Fig. 13 shows the choice of strip width required to preserve the
original resolution for the case of pure rotation. The width of the strip L from the
center of image I1 to the center of I2 can be set to

\[
L = \left| \Omega \times (0, 0, f_c)^T \right| = f_c \sqrt{\Omega_X^2 + \Omega_Y^2}
\]

where f_c is the focal length of the camera (or the pipe's radius), × is the cross
product operator, and ()^T is the transpose operator. This will give similar
results as in other panoramic mosaicing methods restricted to pure rotation.
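
As a numerical check of the strip width for pure rotation, the sketch below
evaluates the cross product with NumPy. The function name and the sample rotation
are hypothetical; the rotation about the optical axis (the Z component) does not
contribute to the width.

```python
import numpy as np

def strip_width_rotation(omega, fc):
    """Strip width L = |omega x (0, 0, fc)| for a pure camera rotation."""
    return float(np.linalg.norm(np.cross(omega, [0.0, 0.0, fc])))

L_rot = strip_width_rotation([0.003, 0.004, 0.1], 1000.0)
```

For this example the width is f_c times the magnitude of the X and Y rotation
components, i.e. 1000 x 0.005 = 5 pixels.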

Fig. 14 shows the choice of strip width required to preserve the
original resolution for the case of pure translation. In the case of pure translation,
it would be best if the result has the same effect as that of orthographic projection
(parallax independent). It is therefore suggested that the resolution of the resulting
image is considered in such a way that all objects whose distance from the camera
is at least Z_min will maintain or improve their resolution. Z_min can be defined
according to the application, and in general, corresponds to the closest object,
having the largest image (or pipe) displacement.

For example, Fig. 14 describes a scene with objects that are not
closer to the camera than some distance Z_min. An object of length M will have at
most m = f_c M / Z_min pixels in the image plane, where f_c is the focal length of
the camera. Consider a camera translating with |T| = M, such that the center of the
first image looks at one end of the object, and the center of the second image
looks at the other end of the object. This means that the camera has just passed
that object from one end to the other, thus L = m pixels are required in between in
order to preserve the object's original resolution in the panoramic image. As a
result, the definition L = f_c |T| / Z_min for the width of a strip is
proposed for the case of pure translation, where f_c is the focal length (or the
pipe's radius). A strip whose width is at least L pixels can be used for the creation
of the panoramic image (or some narrower strips, such as L strips from L
intermediate views, where each is one pixel wide, as will be described later). This
definition can cause all objects at a distance Z > Z_min to have better resolution
than in the original sequence of images.
In the case of general motion, the width of the strip L between I1
and I2 can be directly determined from f_c (the focal length), T (the translation
vector) and Ω (the rotation vector). For example, the following equation can be
used:

\[
L = f_c \left| \frac{T}{Z_{min}} + \Omega \times (0, 0, 1)^T \right|
\]
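
The general strip width can be sketched in a few lines; the example also shows
that it reduces to the pure-translation and pure-rotation widths when the other
motion component is zero. The function name and the sample values are
hypothetical, and T and Z_min are assumed to be expressed in the same units.

```python
import numpy as np

def strip_width(T, omega, fc, z_min):
    """Strip width L = fc * |T / z_min + omega x (0, 0, 1)| for general motion."""
    T = np.asarray(T, dtype=float)
    rotational = np.cross(omega, [0.0, 0.0, 1.0])
    return float(fc * np.linalg.norm(T / z_min + rotational))

# Pure translation: L = fc * |T| / z_min.
L_trans = strip_width([0.2, 0.0, 0.0], [0.0, 0.0, 0.0], 1000.0, 10.0)
# Pure rotation: L = fc * sqrt(omega_x**2 + omega_y**2).
L_rot = strip_width([0.0, 0.0, 0.0], [0.003, 0.004, 0.1], 1000.0, 10.0)
```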

Note that T and Z can usually be recovered only up to a scale
factor, but the relation between them can be recovered uniquely. The term
f_c |T| / Z_min defines the maximum magnitude of optical flow induced by camera
translation, which is recoverable. This definition does not depend on any one
specific region in the image, and depends only on the camera motion parameters,
thus it is consistent along the sequence of images, and enables the creation of
realistic panoramic mosaics.

MOSAICING USING NEW VIEW GENERATION

In order to create a manifold projection, the images are considered 
to be a one-dimensional (not necessarily linear) scan of the scene, which is a 
collection of strips that are approximately perpendicular to the direction of the
camera motion.

Taking strips from different images with strip widths of more than
one pixel works well only if there is no parallax. For the general case that
includes parallax, instead of taking a strip with a width of L pixels, intermediate
images can be synthetically generated, and narrower strips can be used. For
example, a collection of L strips, each with a width of one pixel, can be taken
from interpolated views in between the original camera positions.
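
The collection of one-pixel strips can be sketched as follows. The sketch assumes
that the intermediate views have already been synthesized by some view
interpolation method (not shown here), that the camera moves sideways so that
strips are vertical columns, and that the strip is taken from the central column
unless another column is given. The function name and the toy inputs are
hypothetical.

```python
import numpy as np

def mosaic_from_views(views, column=None):
    """Build a mosaic by taking one pixel-wide column from each view in order."""
    columns = []
    for view in views:
        view = np.asarray(view)
        cx = view.shape[1] // 2 if column is None else column
        columns.append(view[:, cx:cx + 1])   # keep the column two-dimensional
    return np.hstack(columns)

# Three toy "views" of 4 x 5 pixels, each filled with its own index.
views = [np.full((4, 5), i) for i in range(3)]
mosaic = mosaic_from_views(views)
```

Each view contributes exactly one column, so L intermediate views yield a mosaic
section L pixels wide.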

Fig. 15 shows the generation of a panoramic image using view
interpolation by generating synthetic views from intermediate camera positions
for two examples, one for the case of translation, and one for the case of rotation.
In the case of translation, the objects A, B, X, Y, C, D are viewable in the two
subsequent frames I1 and I2, taken by a camera which is translating from position
C1 to position C2. All intermediate images required are recovered for the
in-between views N1, N2, and a single strip (one pixel wide) is taken from each
intermediate image. The process of generating these intermediate views and
collecting these strips gives as a result the panoramic mosaic P1. This
panorama is realistic, and does not suffer from parallax effects.

The same mechanism applies also for the case of rotation. Here, the
objects E, F, W, Z, L, M are viewable in the two subsequent frames I3 and I4,
taken by a camera whose location is fixed, and whose orientation changes from
C3 to C4. All intermediate images required are recovered for the in-between
views N7, N8, and a narrow strip is taken from each intermediate image. The
result of this process is the panoramic mosaic P2. This panorama is as good as the
panorama created by some previous methods, as no parallax effects are
encountered in pure rotation.

In order to synthesize new views various known methods can be
used, such as Optical Flow interpolation [6, 19], Trilinear tensor methods [17],
and others. In most cases approximate methods will give good results. The
creation of the intermediate views should require only view interpolation, since in
most applications view extrapolation is not required.

The use of intermediate views for strip collection gives the effect of 
orthographic projection, which avoids parallax discontinuities. For example, Fig. 
16 shows the generation of consistent panoramic mosaics in the presence of 
parallax. The method described above overcomes the difficulties of parallax
using view interpolation, and the result remains realistic.

Although the present invention has been described to a certain 
degree of particularity, it should be understood that various alterations and 
modifications could be made without departing from the spirit or scope of the 
invention as hereinafter claimed.

CLAIMS:

1. A method for combining a sequence of two-dimensional images of
a scene to obtain a panoramic mosaic of said scene, said sequence of
two-dimensional images being acquired by a moving camera in relative motion
with respect to said scene, said camera having an optical center, the camera
motion giving rise to optical flow between the images, the method comprising the
steps of:

warping the images; and

pasting the images into the panoramic image,

such that the optical flow becomes substantially parallel to the direction in
which the mosaic is constructed.

2. A method for combining a sequence of two-dimensional images of
a scene to obtain a panoramic mosaic of said scene, said sequence of
two-dimensional images being acquired by a moving camera in relative motion
with respect to said scene, said camera having an optical center, the camera
motion giving rise to optical flow between the images, the method comprising the
steps of:

(a) selecting for each image of said sequence at least one strip such that
each strip is substantially perpendicular to said optical flow; said strips having a
front edge and a back edge with the optical flow entering a strip through the front
edge and exiting the strip through the back edge; and

(b) pasting together said strips from adjacent images to construct a panoramic
mosaic.

3. The method according to Claim 2, further comprising the step of:

(a') warping the front edge of a strip defined on a two-dimensional image so
that it is substantially aligned with the back edge of a strip defined on an adjacent
two-dimensional image.

4. The method according to Claim 1, wherein the strips are
transformed by warping into strips having edges of arbitrary shape before the
strips are pasted together.

5. The method according to Claim 1, wherein the strips are
transformed by warping into strips having straight edges before the strips are
combined together.

6. The method according to Claim 2, wherein the strips are 
transformed by warping into strips having edges of arbitrary shape before the 
strips are pasted together. 

7. The method according to Claim 2, wherein the strips are
transformed by warping into strips having straight edges before the strips are
combined together.

8. The method according to Claim 1, wherein the two-dimensional
images are related by an affine transformation or by a planar-projective
transformation.

9. The method according to Claim 2, wherein the two-dimensional 
images are related by an affine transformation or by a planar-projective 
transformation. 

10. The method according to Claim 1, wherein said images are
projected onto a three-dimensional cylinder whose major axis approximates the
path of the camera centers of said images, the combination of the strips is
achieved by translating the projected two-dimensional images substantially along
the cylindrical surface of the three-dimensional cylinder.

11. The method according to Claim 2, wherein said images are 
projected onto a three-dimensional cylinder whose major axis approximates the 
path of the camera centers of said images, and wherein the combination of the 
strips is achieved by translating the projected two-dimensional images substantially 
along the cylindrical surface of the three-dimensional cylinder. 

12. The method according to Claim 10, wherein every two subsequent 
images define their own cylinder whose major axis substantially passes through 
the centers of the cameras of said images, and the cylinders are concatenated 
substantially along the image sequence. 

13. The method according to Claim 11, wherein every two subsequent 
images define their own cylinder whose major axis substantially passes through 
the centers of the cameras of said images, and the cylinders are concatenated 
substantially along the image sequence. 

14. The method according to Claim 1, wherein a transformation is 
applied to the panoramic mosaic depending on a desired viewpoint. 

15. The method according to Claim 2, wherein a transformation is 
applied to the panoramic mosaic depending on a desired viewpoint. 

16. The method according to Claim 1, wherein the sequence of images 
is augmented by sets of interpolated images intermediate to the images of the 
sequence of images, and wherein the strips are augmented with strips defined on 
the interpolated images. 

17. The method according to Claim 2, wherein the sequence of images 
is augmented by sets of interpolated images intermediate to the images of the 
sequence of images, and wherein the strips are augmented with strips defined on 
the interpolated images. 

18. A system for combining a sequence of two-dimensional images of 
a scene to obtain a panoramic mosaic of said scene, said sequence of 
two-dimensional images being acquired by a moving camera in relative motion 
with respect to said scene, said camera having an optical center, the camera 
motion giving rise to optical flow between the images, the system comprising: 

warper for warping the images; 

paster for pasting the images into the panoramic image, such that the 
optical flow becomes substantially parallel to the direction in which the mosaic is 
constructed. 

19. A system for combining a sequence of two-dimensional images of 
a scene to obtain a panoramic mosaic of said scene, said sequence of 
two-dimensional images being acquired by a moving camera in relative motion 
with respect to said scene, said camera having an optical center, the camera 
motion giving rise to optical flow between the images, the system comprising: 

(a) selector for selecting for each image of said sequence at least one strip 
such that each strip is substantially perpendicular to said optical flow; said strips 
having a front edge and a back edge with the optical flow entering a strip through 
the front edge and exiting the strip through the back edge; and 

(b) paster for pasting together said strips from adjacent images in such a 
way that the front edge of a strip defined on an image is substantially aligned with 
the back edge of a strip defined on an adjacent image. 

20. A memory containing a file representing a panoramic mosaic of a 
scene generated by the method of Claim 1. 

21. A memory containing a file representing a panoramic mosaic of a 
scene generated by the method of Claim 2. 
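
By way of non-limiting illustration only, the selecting and pasting steps recited in Claim 2 may be sketched for the simplest case of a camera translating sideways, in which the optical flow is horizontal and the strips perpendicular to it are therefore vertical. The function name paste_vertical_strips, the use of NumPy arrays, and the assumption that the horizontal displacement between consecutive images is already known as a whole number of pixels are all hypothetical choices made for this sketch and are not taken from the specification.

```python
import numpy as np


def paste_vertical_strips(frames, shifts):
    """Build a mosaic from vertical strips (illustrative sketch only).

    frames: list of 2-D arrays of identical shape (rows, cols).
    shifts: shifts[k] is the assumed, already-estimated horizontal
            displacement in whole pixels between frames[k] and frames[k + 1].
    """
    if len(shifts) != len(frames) - 1:
        raise ValueError("need exactly one shift per pair of adjacent frames")

    width = frames[0].shape[1]
    centre = width // 2
    strips = []
    # zip stops after the last shift, so the final frame contributes no strip.
    for frame, shift in zip(frames, shifts):
        if shift <= 0 or centre + shift > width:
            raise ValueError("shift must be positive and fit inside the frame")
        # The strip starts at the image centre and is as wide as the
        # displacement to the next frame, so the edge where this strip ends
        # coincides with the edge where the next frame's strip begins.
        strips.append(frame[:, centre:centre + shift])

    # Pasting: concatenate the strips along the direction of the flow.
    return np.concatenate(strips, axis=1)
```

In this sketch the width of each strip equals the displacement to the following image, which is what keeps adjacent strips contiguous in the mosaic; the warping of Claims 3 to 7 and the cylinder projection of Claims 10 to 13 are not modelled.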